Computer Vision and Machine Learning in Python

An introduction

Luca Marchesotti - Beautifeye Labs

In [1]:
import matplotlib.pyplot as plt

from IPython.core.display import Image 
 
In []:
from skimage.feature import hog
from skimage import data, color, exposure 
import cv2

Who we are

In [2]:
Image(filename='imgs/Beautifeye-logo-black.png')
Out[2]:

What we do

... we turn vision into decisions

Where are we?

In [3]:
Image(url='http://pulsosocial.com/en/wp-content/uploads/2012/08/logos-wayra-4-rgb1.jpeg')
Out[3]:

What is computer vision?

In [5]:
from IPython.display import HTML
HTML('<iframe src="http://research.microsoft.com/en-us/groups/vision/" width=900 height=400></iframe>')
Out[5]:

What is machine learning?

In [6]:
Image(filename='imgs/what-is-machine-learning.png')
Out[6]:

Why do Computer Vision and machine learning matter? - The numbers

  • 2.5 Billion pieces of content shared every day on Facebook
  • 500 million Tweets are sent per day
  • 2 billion images are shared every day

Why do Computer Vision and machine learning matter? - The money

  • Google buys the DeepMind startup for USD 500 million
  • IBM invests USD 1 billion in a new division to develop uses for Watson
  • Facebook creates a new Lab for machine learning

Outline

  • Large-scale image classification: Shallow Learning
    • Low level features
    • Mid level features
    • Learning classifiers
  • Large-scale image classification : Deep learning
    • Architecture
  • CV & ML @ Beautifeye
    • Detecting brands on Instagram
    • Learning styles on Google Streetview images

Low-level features - Histogram of Oriented Gradients

Libraries

In [7]:
import matplotlib.pyplot as plt

from skimage.feature import hog
from skimage import data

import cv2

Firstly, let's load a sample image

In [8]:
image_rgb = data.lena()

Then we discard non-salient information by converting to a more manageable grayscale representation

In [9]:
image = color.rgb2gray(image_rgb)

... and we extract the HOG descriptor with the default parameterisation.

Note that HOG = Histogram of Oriented Gradients; computing it involves:

  • (optional) global image normalisation
  • computing the gradient image in x and y
  • computing gradient histograms
  • normalising across blocks
  • flattening into a feature vector
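The core of the gradient and histogram steps can be sketched in plain NumPy. This is a toy illustration of the idea, not skimage's actual implementation (the function name and the single-cell simplification are ours):

```python
import numpy as np

def gradient_histogram(cell, n_bins=8):
    """Toy sketch: an orientation histogram for one image cell."""
    gy, gx = np.gradient(cell.astype(float))         # gradients in y and x
    magnitude = np.hypot(gx, gy)                     # gradient strength
    orientation = np.arctan2(gy, gx) % np.pi         # unsigned angle in [0, pi)
    bins = (orientation / np.pi * n_bins).astype(int).clip(0, n_bins - 1)
    hist = np.zeros(n_bins)
    for b, m in zip(bins.ravel(), magnitude.ravel()):
        hist[b] += m                                 # vote weighted by magnitude
    return hist / (np.linalg.norm(hist) + 1e-6)      # L2 normalisation

# a patch whose intensity grows top to bottom: all votes land in one bin
cell = np.outer(np.arange(16), np.ones(16))
print(gradient_histogram(cell))
```

Real HOG does this per cell over a grid, normalises across blocks, and concatenates the results into one feature vector.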
In [10]:
fd, hog_image = hog(image, orientations=8, pixels_per_cell=(16, 16),
                    cells_per_block=(1, 1), visualise=True)
In [11]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8*5, 4*5))

ax1.axis('off')
ax1.imshow(image_rgb, cmap=plt.cm.gray)
ax1.set_title('Input image')

# Rescale histogram for better display
hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 0.02))

ax2.axis('off')
ax2.imshow(hog_image_rescaled, cmap=plt.cm.gray)
ax2.set_title('Histogram of Oriented Gradients')
plt.show()

Low-level features - Scale Invariant Feature Transform (SIFT)

In [12]:
gray = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2GRAY)  # skimage loads images as RGB, not BGR

Compute the transform and detect keypoints

In [13]:
sift = cv2.SIFT()
kp = sift.detect(gray,None)

Draw the keypoints and extract patches

In [14]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8*5, 4*5))
img_rgb_s = cv2.drawKeypoints(gray, kp)

ax1.axis('off')
ax1.imshow(image_rgb, cmap=plt.cm.gray)
ax1.set_title('Input image')

ax2.axis('off')
ax2.imshow(img_rgb_s, cmap=plt.cm.gray)
ax2.set_title('SIFT keypoints')
Out[14]:
<matplotlib.text.Text at 0x10deab590>
In [15]:
from skimage import io

img_mona   = io.imread('imgs/mona.png')
img_violin = io.imread('imgs/violin.png')
img_bicycle = io.imread('imgs/bicycle.png')


fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8*5, 4*5))

gray = cv2.cvtColor(img_mona, cv2.COLOR_RGB2GRAY)  # io.imread returns RGB

sift = cv2.SIFT()
kp = sift.detect(gray, None)

img_mona_s = cv2.drawKeypoints(gray, kp)

ax1.axis('off')
ax1.imshow(img_mona, cmap=plt.cm.gray)
ax1.set_title('Input image')

ax2.axis('off')
ax2.imshow(img_mona_s, cmap=plt.cm.gray)
ax2.set_title('SIFT keypoints')

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8*5, 4*5))

gray = cv2.cvtColor(img_violin, cv2.COLOR_RGB2GRAY)
sift = cv2.SIFT()
kp = sift.detect(gray, None)

img_violin_s = cv2.drawKeypoints(gray, kp)

ax1.axis('off')
ax1.imshow(img_violin, cmap=plt.cm.gray)
ax1.set_title('Input image')

ax2.axis('off')
ax2.imshow(img_violin_s, cmap=plt.cm.gray)
ax2.set_title('SIFT keypoints')

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8*5, 4*5))

gray = cv2.cvtColor(img_bicycle, cv2.COLOR_RGB2GRAY)
sift = cv2.SIFT()
kp = sift.detect(gray, None)

img_bicycle_s = cv2.drawKeypoints(gray, kp)

ax1.axis('off')
ax1.imshow(img_bicycle, cmap=plt.cm.gray)
ax1.set_title('Input image')

ax2.axis('off')
ax2.imshow(img_bicycle_s, cmap=plt.cm.gray)
ax2.set_title('SIFT keypoints')
Out[15]:
<matplotlib.text.Text at 0x10bb23290>

Mid-level features : the Bag of Words

In [16]:
img = io.imread('http://upload.wikimedia.org/wikipedia/commons/0/08/Bag_of_words.JPG')
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)  # io.imread returns RGB

fig, ax = plt.subplots(figsize=(10*2, 20*2))
imgplot = ax.imshow(img)

We take all the patches from the images we want to categorise

In [17]:
Image(filename='imgs/bow-1-patches.png')
Out[17]:

We put all the patches in a "bag" and we shake... (i.e., we cluster them)

In [18]:
Image(filename='imgs/bow-2-bags.png')
Out[18]:

We count...

In [19]:
Image(filename='imgs/bow-3-histograms.png')
Out[19]:
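The whole pipeline above (collect patches, shake the bag, count) can be sketched with scikit-learn's KMeans. Random vectors stand in for real SIFT descriptors here, and the vocabulary size of 10 is an arbitrary choice:

```python
import numpy as np
from sklearn.cluster import KMeans

np.random.seed(0)
# stand-ins for SIFT descriptors: 200 random 128-d vectors per image
descriptors_per_image = [np.random.rand(200, 128) for _ in range(3)]

# shake the bag: cluster all patches into a visual vocabulary
all_descriptors = np.vstack(descriptors_per_image)
vocabulary = KMeans(n_clusters=10, n_init=4, random_state=0).fit(all_descriptors)

# count: each image becomes a histogram of visual-word occurrences
def bow_histogram(descriptors, vocabulary, n_words=10):
    words = vocabulary.predict(descriptors)
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / hist.sum()          # normalise so histograms are comparable

bow_vectors = [bow_histogram(d, vocabulary) for d in descriptors_per_image]
print(bow_vectors[0])                 # a 10-bin histogram summing to 1
```

These fixed-length histograms are what we feed to the classifier in the next section, regardless of how many patches each image had.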

Learning a visual classifier

Let's project the BoW vectors onto 2 dimensions... (careful: this is just a toy example)

In [20]:
import numpy as np

# we create 40 linearly separable points
np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
Y = [0] * 20 + [1] * 20

plt.scatter(X[:, 0], X[:, 1])
plt.show()
In [21]:
Image(filename='imgs/classification.png')
Out[21]:

Learn a classifier that separates the violin samples from the bicycle ones

In [22]:
from sklearn import svm
# fit a linear model
clf = svm.SVC(kernel='linear')
clf.fit(X, Y)
Out[22]:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
  kernel='linear', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)

get the separating hyperplane

In [23]:
# points on the hyperplane satisfy w . x + b = 0, i.e. y = -(w[0]/w[1]) * x - b/w[1]
w = clf.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - (clf.intercept_[0]) / w[1]
In [24]:
# plot the parallels to the separating hyperplane that pass through the
# support vectors
b = clf.support_vectors_[0]
yy_down = a * xx + (b[1] - a * b[0])
b = clf.support_vectors_[-1]
yy_up = a * xx + (b[1] - a * b[0])

do the boring plotting

In [25]:
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
            s=80, facecolors='none')
plt.scatter(X[:, 0], X[:, 1])
plt.show()
In [26]:
# plot the line, the points, and the nearest vectors to the plane
plt.plot(xx, yy, 'k-')
plt.plot(xx, yy_down, 'k--')
plt.plot(xx, yy_up, 'k--')
 
# plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
#             s=80, facecolors='none')
plt.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.Paired)
 
# plt.axis('tight')
plt.show()
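Once trained, the same classifier can label unseen points. A minimal, self-contained sketch on the same toy data (the query points are our own illustrative choice):

```python
import numpy as np
from sklearn import svm

np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
Y = [0] * 20 + [1] * 20

clf = svm.SVC(kernel='linear')
clf.fit(X, Y)

# points deep on either side of the hyperplane get the expected label
print(clf.predict([[-3, -3], [3, 3]]))            # -> [0 1]
# signed distance to the hyperplane: sign gives the class, magnitude the margin
print(clf.decision_function([[-3, -3], [3, 3]]))
```

In the real pipeline, `X` would hold the BoW histograms of the training images and `predict` would label new images.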

Deep Learning

A Deep Learning method is:

a method that makes predictions by applying a sequence of non-linear processing stages.

The resulting intermediate representations can be interpreted as feature hierarchies and the whole system is jointly learned from data.

Marc'Aurelio Ranzato, Principal Scientist, FB
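The definition can be made concrete with a toy forward pass: a stack of linear maps each followed by a non-linearity, where every intermediate activation is a feature representation. Random weights stand in for learned ones here; this is a sketch of the idea, not a real network:

```python
import numpy as np

np.random.seed(0)
relu = lambda z: np.maximum(z, 0)       # the non-linear processing stage

# three stages: input -> hidden -> hidden -> output
sizes = [64, 32, 16, 2]
weights = [np.random.randn(m, n) * 0.1 for m, n in zip(sizes[:-1], sizes[1:])]

x = np.random.rand(64)                  # e.g. a flattened image patch
for W in weights:
    x = relu(x @ W)                     # each stage re-represents its input

print(x.shape)                          # the final 2-d representation
```

In deep learning, all the `weights` are learned jointly from data, so the intermediate representations form a feature hierarchy instead of being hand-designed like HOG or SIFT.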

In [27]:
from IPython.core.display import Image 
Image('imgs/cnn.png')
Out[27]:

Some examples from Beautifeye

In []: